Overview

Dataset statistics

Number of variables: 25
Number of observations: 8399
Missing cells: 966
Missing cells (%): 0.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 200.0 B
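
The percentages in this block can be re-derived from the raw counts. The sketch below is only a sanity check under the assumption that "missing cells (%)" is missing cells divided by variables × observations, and that 1.6 MiB is a rounded figure; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 25
n_observations = 8399
missing_cells = 966

total_cells = n_variables * n_observations        # every (row, column) pair
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

# 1.6 MiB expressed in bytes, then spread evenly over all rows.
# The input is rounded, so the result is only approximate.
total_bytes = 1.6 * 1024 * 1024
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.0f} B")
```

Both values agree with the report after rounding.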

Variable types

Text: 3
Numeric: 11
Categorical: 9
DateTime: 2

Alerts

number_of_records has constant value "1" [Constant]
order_id is highly overall correlated with row_id [High correlation]
row_id is highly overall correlated with order_id [High correlation]
sales is highly overall correlated with shipping_cost and 1 other field [High correlation]
shipping_cost is highly overall correlated with sales and 2 other fields [High correlation]
unit_price is highly overall correlated with sales and 1 other field [High correlation]
zip_code is highly overall correlated with region and 1 other field [High correlation]
product_category is highly overall correlated with product_sub_category [High correlation]
product_container is highly overall correlated with product_sub_category and 1 other field [High correlation]
product_sub_category is highly overall correlated with product_category and 2 other fields [High correlation]
region is highly overall correlated with zip_code and 1 other field [High correlation]
ship_mode is highly overall correlated with shipping_cost and 2 other fields [High correlation]
state is highly overall correlated with zip_code and 1 other field [High correlation]
customer_age has 903 (10.8%) missing values [Missing]
row_id is uniformly distributed [Uniform]
row_id has unique values [Unique]
discount has 756 (9.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-10-22 10:05:24.206409
Analysis finished: 2023-10-22 10:05:44.159187
Duration: 19.95 seconds
Software version: ydata-profiling v4.6.0
Download configuration: config.json
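
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the format string is an assumption about how the timestamps are laid out, based on how they are printed here):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-10-22 10:05:24.206409", FMT)
finished = datetime.strptime("2023-10-22 10:05:44.159187", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```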

Variables

city
Text

Distinct: 1421
Distinct (%): 16.9%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Length

Max length: 19
Median length: 16
Mean length: 9.1384689
Min length: 3

Characters and Unicode

Total characters: 76754
Distinct characters: 52
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
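
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a handful of city values taken from this report; the helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# A few city values that appear in this report.
sample = ["McKeesport", "Bowie", "Napa", "Montebello", "Napa"]
counts = category_counts(sample)

# Keys are two-letter category codes:
# Lu = Uppercase Letter, Ll = Lowercase Letter, Zs = Space Separator.
print(counts)
```

The long names used in the tables ("Lowercase Letter", "Space Separator") correspond to these two-letter codes.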

Unique

Unique: 151
Unique (%): 1.8%

Sample

1st row: McKeesport
2nd row: Bowie
3rd row: Napa
4th row: Montebello
5th row: Napa

Value | Count | Frequency (%)
city | 266 | 2.4%
park | 165 | 1.5%
beach | 116 | 1.0%
west | 112 | 1.0%
heights | 107 | 1.0%
san | 97 | 0.9%
north | 89 | 0.8%
lake | 88 | 0.8%
saint | 85 | 0.8%
hills | 73 | 0.7%
Other values (1377) | 9992 | 89.3%

Most occurring characters

Value | Count | Frequency (%)
e | 7209 | 9.4%
a | 6797 | 8.9%
n | 5679 | 7.4%
o | 5628 | 7.3%
l | 5073 | 6.6%
r | 4939 | 6.4%
i | 4617 | 6.0%
t | 3962 | 5.2%
s | 3383 | 4.4%
(space) | 2791 | 3.6%
Other values (42) | 26676 | 34.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 62773 | 81.8%
Uppercase Letter | 11190 | 14.6%
Space Separator | 2791 | 3.6%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 7209 | 11.5%
a | 6797 | 10.8%
n | 5679 | 9.0%
o | 5628 | 9.0%
l | 5073 | 8.1%
r | 4939 | 7.9%
i | 4617 | 7.4%
t | 3962 | 6.3%
s | 3383 | 5.4%
d | 1997 | 3.2%
Other values (16) | 13489 | 21.5%

Uppercase Letter
Value | Count | Frequency (%)
C | 1214 | 10.8%
S | 1001 | 8.9%
P | 952 | 8.5%
M | 851 | 7.6%
B | 851 | 7.6%
H | 707 | 6.3%
L | 688 | 6.1%
W | 555 | 5.0%
R | 548 | 4.9%
A | 472 | 4.2%
Other values (15) | 3351 | 29.9%

Space Separator
Value | Count | Frequency (%)
(space) | 2791 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 73963 | 96.4%
Common | 2791 | 3.6%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 7209 | 9.7%
a | 6797 | 9.2%
n | 5679 | 7.7%
o | 5628 | 7.6%
l | 5073 | 6.9%
r | 4939 | 6.7%
i | 4617 | 6.2%
t | 3962 | 5.4%
s | 3383 | 4.6%
d | 1997 | 2.7%
Other values (41) | 24679 | 33.4%

Common
Value | Count | Frequency (%)
(space) | 2791 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 76754 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 7209 | 9.4%
a | 6797 | 8.9%
n | 5679 | 7.4%
o | 5628 | 7.3%
l | 5073 | 6.6%
r | 4939 | 6.4%
i | 4617 | 6.0%
t | 3962 | 5.2%
s | 3383 | 4.4%
(space) | 2791 | 3.6%
Other values (42) | 26676 | 34.8%

customer_age
Real number (ℝ)

MISSING 

Distinct: 48
Distinct (%): 0.6%
Missing: 903
Missing (%): 10.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 54.542823
Minimum: 41
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 41
5-th percentile: 42
Q1: 47
median: 53
Q3: 61
95-th percentile: 71
Maximum: 95
Range: 54
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 9.5194352
Coefficient of variation (CV): 0.1745314
Kurtosis: -0.072222184
Mean: 54.542823
Median Absolute Deviation (MAD): 7
Skewness: 0.6106657
Sum: 408853
Variance: 90.619647
Monotonicity: Not monotonic
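
Several of these descriptive statistics are tied together by simple identities. The sketch below re-derives the mean, the coefficient of variation, and the variance from the other reported figures. It assumes the mean is taken over non-missing rows only, which the arithmetic supports; the variable names are illustrative.

```python
# Figures copied from the customer_age summary.
n_observations = 8399
missing = 903
total = 408853        # "Sum"
std = 9.5194352       # "Standard deviation"

non_missing = n_observations - missing   # rows that actually carry an age
mean = total / non_missing               # mean over non-missing rows
cv = std / mean                          # coefficient of variation
variance = std ** 2                      # variance is the squared std

print(f"mean={mean:.6f}  cv={cv:.7f}  variance={variance:.6f}")
```

The three results match the reported Mean, CV, and Variance to the precision shown.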
Histogram with fixed size bins (bins=48)

Value | Count | Frequency (%)
46 | 413 | 4.9%
41 | 356 | 4.2%
47 | 330 | 3.9%
51 | 329 | 3.9%
56 | 310 | 3.7%
44 | 304 | 3.6%
48 | 299 | 3.6%
55 | 293 | 3.5%
50 | 281 | 3.3%
42 | 276 | 3.3%
Other values (38) | 4305 | 51.3%
(Missing) | 903 | 10.8%

Value | Count | Frequency (%)
41 | 356 | 4.2%
42 | 276 | 3.3%
43 | 263 | 3.1%
44 | 304 | 3.6%
45 | 252 | 3.0%
46 | 413 | 4.9%
47 | 330 | 3.9%
48 | 299 | 3.6%
49 | 203 | 2.4%
50 | 281 | 3.3%

Value | Count | Frequency (%)
95 | 8 | 0.1%
93 | 1 | < 0.1%
88 | 7 | 0.1%
86 | 19 | 0.2%
85 | 2 | < 0.1%
84 | 1 | < 0.1%
82 | 8 | 0.1%
81 | 11 | 0.1%
80 | 3 | < 0.1%
79 | 21 | 0.3%

Distinct: 795
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Length

Max length: 22
Median length: 19
Mean length: 12.867127
Min length: 7

Characters and Unicode

Total characters: 108071
Distinct characters: 53
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 0.1%

Sample

1st row: Jessica Myrick
2nd row: Matt Collister
3rd row: Alan Schoenberger
4th row: Elizabeth Moffitt
5th row: Alan Schoenberger

Value | Count | Frequency (%)
michael | 105 | 0.6%
john | 93 | 0.6%
brown | 93 | 0.6%
liz | 87 | 0.5%
michelle | 86 | 0.5%
jones | 86 | 0.5%
patrick | 83 | 0.5%
bill | 80 | 0.5%
alan | 77 | 0.5%
price | 75 | 0.4%
Other values (895) | 15961 | 94.9%

Most occurring characters

Value | Count | Frequency (%)
a | 10075 | 9.3%
e | 9451 | 8.7%
n | 8461 | 7.8%
(space) | 8427 | 7.8%
r | 7919 | 7.3%
i | 6569 | 6.1%
l | 5720 | 5.3%
o | 5302 | 4.9%
t | 4465 | 4.1%
s | 3528 | 3.3%
Other values (43) | 38154 | 35.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 82315 | 76.2%
Uppercase Letter | 17202 | 15.9%
Space Separator | 8427 | 7.8%
Other Punctuation | 127 | 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
C | 1563 | 9.1%
B | 1510 | 8.8%
S | 1487 | 8.6%
M | 1487 | 8.6%
D | 1121 | 6.5%
J | 1009 | 5.9%
A | 988 | 5.7%
P | 880 | 5.1%
H | 796 | 4.6%
T | 784 | 4.6%
Other values (16) | 5577 | 32.4%

Lowercase Letter
Value | Count | Frequency (%)
a | 10075 | 12.2%
e | 9451 | 11.5%
n | 8461 | 10.3%
r | 7919 | 9.6%
i | 6569 | 8.0%
l | 5720 | 6.9%
o | 5302 | 6.4%
t | 4465 | 5.4%
s | 3528 | 4.3%
h | 3269 | 4.0%
Other values (15) | 17556 | 21.3%

Space Separator
Value | Count | Frequency (%)
(space) | 8427 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
' | 127 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 99517 | 92.1%
Common | 8554 | 7.9%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 10075 | 10.1%
e | 9451 | 9.5%
n | 8461 | 8.5%
r | 7919 | 8.0%
i | 6569 | 6.6%
l | 5720 | 5.7%
o | 5302 | 5.3%
t | 4465 | 4.5%
s | 3528 | 3.5%
h | 3269 | 3.3%
Other values (41) | 34758 | 34.9%

Common
Value | Count | Frequency (%)
(space) | 8427 | 98.5%
' | 127 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 108071 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 10075 | 9.3%
e | 9451 | 8.7%
n | 8461 | 7.8%
(space) | 8427 | 7.8%
r | 7919 | 7.3%
i | 6569 | 6.1%
l | 5720 | 5.3%
o | 5302 | 4.9%
t | 4465 | 4.1%
s | 3528 | 3.3%
Other values (43) | 38154 | 35.3%

customer_segment
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Value | Count
Corporate | 3076
Home Office | 2032
Consumer | 1649
Small Business | 1642

Length

Max length: 14
Median length: 11
Mean length: 10.265032
Min length: 8

Characters and Unicode

Total characters: 86216
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Small Business
2nd row: Home Office
3rd row: Corporate
4th row: Consumer
5th row: Corporate

Common Values

Value | Count | Frequency (%)
Corporate | 3076 | 36.6%
Home Office | 2032 | 24.2%
Consumer | 1649 | 19.6%
Small Business | 1642 | 19.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
corporate | 3076 | 25.5%
home | 2032 | 16.8%
office | 2032 | 16.8%
consumer | 1649 | 13.7%
small | 1642 | 13.6%
business | 1642 | 13.6%
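
The word-level frequencies can be reproduced from the category counts. The sketch below assumes words are obtained by lowercasing each value and splitting on whitespace; that tokenisation is a guess that happens to match the numbers here, not a statement about the tool's internals.

```python
from collections import Counter

# Category counts copied from the "Common Values" table.
value_counts = {
    "Corporate": 3076,
    "Home Office": 2032,
    "Consumer": 1649,
    "Small Business": 1642,
}

# Assumed tokenisation: lowercase, then split on whitespace.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

# Percentages are relative to the total number of words, not rows.
total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word:<10} {count:>5} {100 * count / total_words:.1f}%")
```

Because two-word categories contribute two tokens per row, the word total exceeds the 8399 observations, which is why "corporate" drops from 36.6% of rows to 25.5% of words.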

Most occurring characters

Value | Count | Frequency (%)
e | 10431 | 12.1%
o | 9833 | 11.4%
r | 7801 | 9.0%
s | 6575 | 7.6%
m | 5323 | 6.2%
C | 4725 | 5.5%
a | 4718 | 5.5%
f | 4064 | 4.7%
(space) | 3674 | 4.3%
i | 3674 | 4.3%
Other values (10) | 25398 | 29.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 70469 | 81.7%
Uppercase Letter | 12073 | 14.0%
Space Separator | 3674 | 4.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 10431 | 14.8%
o | 9833 | 14.0%
r | 7801 | 11.1%
s | 6575 | 9.3%
m | 5323 | 7.6%
a | 4718 | 6.7%
f | 4064 | 5.8%
i | 3674 | 5.2%
u | 3291 | 4.7%
n | 3291 | 4.7%
Other values (4) | 11468 | 16.3%

Uppercase Letter
Value | Count | Frequency (%)
C | 4725 | 39.1%
O | 2032 | 16.8%
H | 2032 | 16.8%
S | 1642 | 13.6%
B | 1642 | 13.6%

Space Separator
Value | Count | Frequency (%)
(space) | 3674 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 82542 | 95.7%
Common | 3674 | 4.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 10431 | 12.6%
o | 9833 | 11.9%
r | 7801 | 9.5%
s | 6575 | 8.0%
m | 5323 | 6.4%
C | 4725 | 5.7%
a | 4718 | 5.7%
f | 4064 | 4.9%
i | 3674 | 4.5%
u | 3291 | 4.0%
Other values (9) | 22107 | 26.8%

Common
Value | Count | Frequency (%)
(space) | 3674 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 86216 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 10431 | 12.1%
o | 9833 | 11.4%
r | 7801 | 9.0%
s | 6575 | 7.6%
m | 5323 | 6.2%
C | 4725 | 5.5%
a | 4718 | 5.5%
f | 4064 | 4.7%
(space) | 3674 | 4.3%
i | 3674 | 4.3%
Other values (10) | 25398 | 29.5%

discount
Real number (ℝ)

ZEROS 

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.049671389
Minimum: 0
Maximum: 0.25
Zeros: 756
Zeros (%): 9.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.02
median: 0.05
Q3: 0.08
95-th percentile: 0.1
Maximum: 0.25
Range: 0.25
Interquartile range (IQR): 0.06

Descriptive statistics

Standard deviation: 0.03182302
Coefficient of variation (CV): 0.64067102
Kurtosis: -0.95941106
Mean: 0.049671389
Median Absolute Deviation (MAD): 0.03
Skewness: 0.073916963
Sum: 417.19
Variance: 0.0010127046
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)

Value | Count | Frequency (%)
0.01 | 806 | 9.6%
0.05 | 786 | 9.4%
0.03 | 779 | 9.3%
0.09 | 778 | 9.3%
0.04 | 770 | 9.2%
0.08 | 765 | 9.1%
0.02 | 765 | 9.1%
0 | 756 | 9.0%
0.1 | 745 | 8.9%
0.06 | 734 | 8.7%
Other values (6) | 715 | 8.5%

Value | Count | Frequency (%)
0 | 756 | 9.0%
0.01 | 806 | 9.6%
0.02 | 765 | 9.1%
0.03 | 779 | 9.3%
0.04 | 770 | 9.2%
0.05 | 786 | 9.4%
0.06 | 734 | 8.7%
0.07 | 710 | 8.5%
0.08 | 765 | 9.1%
0.09 | 778 | 9.3%

Value | Count | Frequency (%)
0.25 | 1 | < 0.1%
0.21 | 1 | < 0.1%
0.17 | 1 | < 0.1%
0.16 | 1 | < 0.1%
0.11 | 1 | < 0.1%
0.1 | 745 | 8.9%
0.09 | 778 | 9.3%
0.08 | 765 | 9.1%
0.07 | 710 | 8.5%
0.06 | 734 | 8.7%
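
Because discount has only 16 distinct values, the frequency tables for this variable together list every value with its count, so the full distribution can be rebuilt and the summary statistics cross-checked. The sketch below does that with the standard library; the dictionary is assembled by hand from those tables and the names are illustrative.

```python
import statistics

# Discount value -> row count, merged from the frequency tables for discount.
counts = {
    0.00: 756, 0.01: 806, 0.02: 765, 0.03: 779, 0.04: 770, 0.05: 786,
    0.06: 734, 0.07: 710, 0.08: 765, 0.09: 778, 0.10: 745,
    0.11: 1, 0.16: 1, 0.17: 1, 0.21: 1, 0.25: 1,
}

n = sum(counts.values())                                   # number of rows
total = sum(value * count for value, count in counts.items())
mean = total / n

# Expand to one entry per row (in sorted order) to read off the median.
rows = [value for value, count in sorted(counts.items()) for _ in range(count)]
median = statistics.median(rows)

print(f"n={n}  sum={total:.2f}  mean={mean:.9f}  median={median}")
```

The rebuilt row count, sum, mean, and median agree with the figures reported for this variable.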

number_of_records
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Value | Count
1 | 8399

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 8399
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 8399 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 8399 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 8399 | 100.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8399 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 8399 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 8399 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 8399 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8399 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 8399 | 100.0%

Distinct: 1418
Distinct (%): 16.9%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB
Minimum: 2012-01-01 00:00:00
Maximum: 2015-12-30 00:00:00
Histogram with fixed size bins (bins=50)

order_id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 5496
Distinct (%): 65.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29965.18
Minimum: 3
Maximum: 59973
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 3
5-th percentile: 2818
Q1: 15011.5
median: 29857
Q3: 44596
95-th percentile: 57061
Maximum: 59973
Range: 59970
Interquartile range (IQR): 29584.5

Descriptive statistics

Standard deviation: 17260.883
Coefficient of variation (CV): 0.57603137
Kurtosis: -1.1783167
Mean: 29965.18
Median Absolute Deviation (MAD): 14778
Skewness: 0.0038108922
Sum: 2.5167754 × 10^8
Variance: 2.979381 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
43745 | 6 | 0.1%
24132 | 6 | 0.1%
43875 | 5 | 0.1%
57253 | 5 | 0.1%
59781 | 5 | 0.1%
12067 | 5 | 0.1%
52896 | 5 | 0.1%
33797 | 5 | 0.1%
43488 | 5 | 0.1%
8995 | 5 | 0.1%
Other values (5486) | 8347 | 99.4%

Value | Count | Frequency (%)
3 | 1 | < 0.1%
6 | 1 | < 0.1%
32 | 4 | < 0.1%
35 | 2 | < 0.1%
36 | 1 | < 0.1%
65 | 1 | < 0.1%
66 | 1 | < 0.1%
69 | 2 | < 0.1%
70 | 2 | < 0.1%
96 | 1 | < 0.1%

Value | Count | Frequency (%)
59973 | 2 | < 0.1%
59971 | 3 | < 0.1%
59969 | 2 | < 0.1%
59943 | 1 | < 0.1%
59942 | 1 | < 0.1%
59939 | 1 | < 0.1%
59937 | 1 | < 0.1%
59911 | 1 | < 0.1%
59909 | 2 | < 0.1%
59906 | 1 | < 0.1%

order_priority
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Value | Count
High | 1768
Low | 1720
Not Specified | 1672
Medium | 1631
Critical | 1608

Length

Max length: 13
Median length: 6
Mean length: 6.7410406
Min length: 3

Characters and Unicode

Total characters: 56618
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: High
2nd row: Not Specified
3rd row: Low
4th row: Critical
5th row: Low

Common Values

Value | Count | Frequency (%)
High | 1768 | 21.1%
Low | 1720 | 20.5%
Not Specified | 1672 | 19.9%
Medium | 1631 | 19.4%
Critical | 1608 | 19.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
high | 1768 | 17.6%
low | 1720 | 17.1%
not | 1672 | 16.6%
specified | 1672 | 16.6%
medium | 1631 | 16.2%
critical | 1608 | 16.0%

Most occurring characters

Value | Count | Frequency (%)
i | 9959 | 17.6%
e | 4975 | 8.8%
o | 3392 | 6.0%
d | 3303 | 5.8%
t | 3280 | 5.8%
c | 3280 | 5.8%
H | 1768 | 3.1%
g | 1768 | 3.1%
h | 1768 | 3.1%
L | 1720 | 3.0%
Other values (13) | 21405 | 37.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 44875 | 79.3%
Uppercase Letter | 10071 | 17.8%
Space Separator | 1672 | 3.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 9959 | 22.2%
e | 4975 | 11.1%
o | 3392 | 7.6%
d | 3303 | 7.4%
t | 3280 | 7.3%
c | 3280 | 7.3%
g | 1768 | 3.9%
h | 1768 | 3.9%
w | 1720 | 3.8%
f | 1672 | 3.7%
Other values (6) | 9758 | 21.7%

Uppercase Letter
Value | Count | Frequency (%)
H | 1768 | 17.6%
L | 1720 | 17.1%
S | 1672 | 16.6%
N | 1672 | 16.6%
M | 1631 | 16.2%
C | 1608 | 16.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1672 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 54946 | 97.0%
Common | 1672 | 3.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 9959 | 18.1%
e | 4975 | 9.1%
o | 3392 | 6.2%
d | 3303 | 6.0%
t | 3280 | 6.0%
c | 3280 | 6.0%
H | 1768 | 3.2%
g | 1768 | 3.2%
h | 1768 | 3.2%
L | 1720 | 3.1%
Other values (12) | 19733 | 35.9%

Common
Value | Count | Frequency (%)
(space) | 1672 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 56618 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
i | 9959 | 17.6%
e | 4975 | 8.8%
o | 3392 | 6.0%
d | 3303 | 5.8%
t | 3280 | 5.8%
c | 3280 | 5.8%
H | 1768 | 3.1%
g | 1768 | 3.1%
h | 1768 | 3.1%
L | 1720 | 3.0%
Other values (13) | 21405 | 37.8%

order_quantity
Real number (ℝ)

Distinct: 50
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.571735
Minimum: 1
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 13
median: 26
Q3: 38
95-th percentile: 48
Maximum: 50
Range: 49
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.481071
Coefficient of variation (CV): 0.56629209
Kurtosis: -1.2080203
Mean: 25.571735
Median Absolute Deviation (MAD): 13
Skewness: -0.017317782
Sum: 214777
Variance: 209.70142
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
31 | 202 | 2.4%
4 | 196 | 2.3%
39 | 195 | 2.3%
46 | 193 | 2.3%
23 | 192 | 2.3%
24 | 192 | 2.3%
3 | 189 | 2.3%
42 | 189 | 2.3%
43 | 184 | 2.2%
41 | 183 | 2.2%
Other values (40) | 6484 | 77.2%

Value | Count | Frequency (%)
1 | 165 | 2.0%
2 | 152 | 1.8%
3 | 189 | 2.3%
4 | 196 | 2.3%
5 | 166 | 2.0%
6 | 172 | 2.0%
7 | 174 | 2.1%
8 | 176 | 2.1%
9 | 155 | 1.8%
10 | 170 | 2.0%

Value | Count | Frequency (%)
50 | 182 | 2.2%
49 | 136 | 1.6%
48 | 172 | 2.0%
47 | 166 | 2.0%
46 | 193 | 2.3%
45 | 163 | 1.9%
44 | 157 | 1.9%
43 | 184 | 2.2%
42 | 189 | 2.3%
41 | 183 | 2.2%

product_base_margin
Real number (ℝ)

Distinct: 51
Distinct (%): 0.6%
Missing: 63
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5125132
Minimum: 0.35
Maximum: 0.85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 0.35
5-th percentile: 0.36
Q1: 0.38
median: 0.52
Q3: 0.59
95-th percentile: 0.78
Maximum: 0.85
Range: 0.5
Interquartile range (IQR): 0.21

Descriptive statistics

Standard deviation: 0.13558894
Coefficient of variation (CV): 0.26455698
Kurtosis: -0.66087023
Mean: 0.5125132
Median Absolute Deviation (MAD): 0.12
Skewness: 0.55939959
Sum: 4272.31
Variance: 0.018384361
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.37 | 761 | 9.1%
0.38 | 678 | 8.1%
0.36 | 628 | 7.5%
0.59 | 497 | 5.9%
0.39 | 482 | 5.7%
0.56 | 459 | 5.5%
0.57 | 459 | 5.5%
0.4 | 408 | 4.9%
0.58 | 387 | 4.6%
0.55 | 314 | 3.7%
Other values (41) | 3263 | 38.8%

Value | Count | Frequency (%)
0.35 | 262 | 3.1%
0.36 | 628 | 7.5%
0.37 | 761 | 9.1%
0.38 | 678 | 8.1%
0.39 | 482 | 5.7%
0.4 | 408 | 4.9%
0.41 | 98 | 1.2%
0.42 | 78 | 0.9%
0.43 | 101 | 1.2%
0.44 | 94 | 1.1%

Value | Count | Frequency (%)
0.85 | 36 | 0.4%
0.84 | 25 | 0.3%
0.83 | 83 | 1.0%
0.82 | 32 | 0.4%
0.81 | 73 | 0.9%
0.8 | 48 | 0.6%
0.79 | 68 | 0.8%
0.78 | 89 | 1.1%
0.77 | 68 | 0.8%
0.76 | 55 | 0.7%

product_category
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Value | Count
Office Supplies | 4610
Technology | 2065
Furniture | 1724

Length

Max length: 15
Median length: 15
Mean length: 12.539112
Min length: 9

Characters and Unicode

Total characters: 105316
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Office Supplies
2nd row: Office Supplies
3rd row: Furniture
4th row: Office Supplies
5th row: Furniture

Common Values

Value | Count | Frequency (%)
Office Supplies | 4610 | 54.9%
Technology | 2065 | 24.6%
Furniture | 1724 | 20.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
office | 4610 | 35.4%
supplies | 4610 | 35.4%
technology | 2065 | 15.9%
furniture | 1724 | 13.3%

Most occurring characters

Value | Count | Frequency (%)
e | 13009 | 12.4%
i | 10944 | 10.4%
p | 9220 | 8.8%
f | 9220 | 8.8%
u | 8058 | 7.7%
c | 6675 | 6.3%
l | 6675 | 6.3%
O | 4610 | 4.4%
s | 4610 | 4.4%
S | 4610 | 4.4%
Other values (10) | 27685 | 26.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 87697 | 83.3%
Uppercase Letter | 13009 | 12.4%
Space Separator | 4610 | 4.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 13009 | 14.8%
i | 10944 | 12.5%
p | 9220 | 10.5%
f | 9220 | 10.5%
u | 8058 | 9.2%
c | 6675 | 7.6%
l | 6675 | 7.6%
s | 4610 | 5.3%
o | 4130 | 4.7%
n | 3789 | 4.3%
Other values (5) | 11367 | 13.0%

Uppercase Letter
Value | Count | Frequency (%)
O | 4610 | 35.4%
S | 4610 | 35.4%
T | 2065 | 15.9%
F | 1724 | 13.3%

Space Separator
Value | Count | Frequency (%)
(space) | 4610 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 100706 | 95.6%
Common | 4610 | 4.4%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 13009 | 12.9%
i | 10944 | 10.9%
p | 9220 | 9.2%
f | 9220 | 9.2%
u | 8058 | 8.0%
c | 6675 | 6.6%
l | 6675 | 6.6%
O | 4610 | 4.6%
s | 4610 | 4.6%
S | 4610 | 4.6%
Other values (9) | 23075 | 22.9%

Common
Value | Count | Frequency (%)
(space) | 4610 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 105316 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 13009 | 12.4%
i | 10944 | 10.4%
p | 9220 | 8.8%
f | 9220 | 8.8%
u | 8058 | 7.7%
c | 6675 | 6.3%
l | 6675 | 6.3%
O | 4610 | 4.4%
s | 4610 | 4.4%
S | 4610 | 4.4%
Other values (10) | 27685 | 26.3%

product_container
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Value | Count
Small Box | 4347
Wrap Bag | 1168
Small Pack | 956
Jumbo Drum | 624
Jumbo Box | 532
Other values (2) | 772

Length

Max length: 10
Median length: 9
Mean length: 9.0926301
Min length: 8

Characters and Unicode

Total characters: 76369
Distinct characters: 24
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Small Box
2nd row: Large Box
3rd row: Jumbo Drum
4th row: Wrap Bag
5th row: Jumbo Drum

Common Values

Value | Count | Frequency (%)
Small Box | 4347 | 51.8%
Wrap Bag | 1168 | 13.9%
Small Pack | 956 | 11.4%
Jumbo Drum | 624 | 7.4%
Jumbo Box | 532 | 6.3%
Large Box | 406 | 4.8%
Medium Box | 366 | 4.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
box | 5651 | 33.6%
small | 5303 | 31.6%
wrap | 1168 | 7.0%
bag | 1168 | 7.0%
jumbo | 1156 | 6.9%
pack | 956 | 5.7%
drum | 624 | 3.7%
large | 406 | 2.4%
medium | 366 | 2.2%

Most occurring characters

Value | Count | Frequency (%)
l | 10606 | 13.9%
a | 9001 | 11.8%
(space) | 8399 | 11.0%
m | 7449 | 9.8%
B | 6819 | 8.9%
o | 6807 | 8.9%
x | 5651 | 7.4%
S | 5303 | 6.9%
r | 2198 | 2.9%
u | 2146 | 2.8%
Other values (14) | 11990 | 15.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 51172 | 67.0%
Uppercase Letter | 16798 | 22.0%
Space Separator | 8399 | 11.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
l | 10606 | 20.7%
a | 9001 | 17.6%
m | 7449 | 14.6%
o | 6807 | 13.3%
x | 5651 | 11.0%
r | 2198 | 4.3%
u | 2146 | 4.2%
g | 1574 | 3.1%
p | 1168 | 2.3%
b | 1156 | 2.3%
Other values (5) | 3416 | 6.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 6819 | 40.6%
S | 5303 | 31.6%
W | 1168 | 7.0%
J | 1156 | 6.9%
P | 956 | 5.7%
D | 624 | 3.7%
L | 406 | 2.4%
M | 366 | 2.2%

Space Separator
Value | Count | Frequency (%)
(space) | 8399 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 67970 | 89.0%
Common | 8399 | 11.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
l | 10606 | 15.6%
a | 9001 | 13.2%
m | 7449 | 11.0%
B | 6819 | 10.0%
o | 6807 | 10.0%
x | 5651 | 8.3%
S | 5303 | 7.8%
r | 2198 | 3.2%
u | 2146 | 3.2%
g | 1574 | 2.3%
Other values (13) | 10416 | 15.3%

Common
Value | Count | Frequency (%)
(space) | 8399 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 76369 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
l | 10606 | 13.9%
a | 9001 | 11.8%
(space) | 8399 | 11.0%
m | 7449 | 9.8%
B | 6819 | 8.9%
o | 6807 | 8.9%
x | 5651 | 7.4%
S | 5303 | 6.9%
r | 2198 | 2.9%
u | 2146 | 2.8%
Other values (14) | 11990 | 15.7%

Distinct: 1263
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Length

Max length: 98
Median length: 75
Mean length: 34.351709
Min length: 3

Characters and Unicode

Total characters: 288520
Distinct characters: 84
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 57
Unique (%): 0.7%

Sample

1st row: Perma STOR-ALL™ Hanging File Box, 13 1/8"W x 12 1/4"D x 10 1/2"H
2nd row: Safco Industrial Wire Shelving
3rd row: Hon 4070 Series Pagoda™ Armless Upholstered Stacking Chairs
4th row: White GlueTop Scratch Pads
5th row: Hon Valutask™ Swivel Chairs

Value | Count | Frequency (%)
xerox | 765 | 1.8%
x | 499 | 1.2%
avery | 418 | 1.0%
with | 405 | 0.9%
black | 338 | 0.8%
(label missing) | 327 | 0.8%
binders | 305 | 0.7%
for | 302 | 0.7%
chair | 276 | 0.6%
keyboard | 268 | 0.6%
Other values (2076) | 38861 | 90.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 34365 | 11.9%
e | 25862 | 9.0%
r | 15875 | 5.5%
o | 15517 | 5.4%
a | 14627 | 5.1%
i | 14001 | 4.9%
l | 12489 | 4.3%
t | 12482 | 4.3%
n | 11792 | 4.1%
s | 11438 | 4.0%
Other values (74) | 120072 | 41.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 184442 | 63.9%
Uppercase Letter | 42833 | 14.8%
Space Separator | 34365 | 11.9%
Decimal Number | 16531 | 5.7%
Other Punctuation | 6372 | 2.2%
Dash Punctuation | 2329 | 0.8%
Other Symbol | 1374 | 0.5%
Final Punctuation | 69 | < 0.1%
Open Punctuation | 68 | < 0.1%
Close Punctuation | 68 | < 0.1%
Other values (2) | 69 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 25862 | 14.0%
r | 15875 | 8.6%
o | 15517 | 8.4%
a | 14627 | 7.9%
i | 14001 | 7.6%
l | 12489 | 6.8%
t | 12482 | 6.8%
n | 11792 | 6.4%
s | 11438 | 6.2%
c | 7441 | 4.0%
Other values (17) | 42918 | 23.3%

Uppercase Letter
Value | Count | Frequency (%)
S | 4518 | 10.5%
C | 4434 | 10.4%
P | 4013 | 9.4%
B | 3859 | 9.0%
D | 2625 | 6.1%
A | 2478 | 5.8%
M | 2403 | 5.6%
F | 2020 | 4.7%
T | 1985 | 4.6%
R | 1664 | 3.9%
Other values (16) | 12834 | 30.0%

Decimal Number
Value | Count | Frequency (%)
1 | 3209 | 19.4%
0 | 2724 | 16.5%
2 | 1984 | 12.0%
3 | 1528 | 9.2%
8 | 1410 | 8.5%
4 | 1406 | 8.5%
9 | 1255 | 7.6%
5 | 1142 | 6.9%
6 | 943 | 5.7%
7 | 930 | 5.6%

Other Punctuation
Value | Count | Frequency (%)
, | 2856 | 44.8%
/ | 1359 | 21.3%
" | 1060 | 16.6%
. | 520 | 8.2%
& | 216 | 3.4%
' | 149 | 2.3%
# | 104 | 1.6%
* | 59 | 0.9%
% | 45 | 0.7%
; | 4 | 0.1%

Other Symbol
Value | Count | Frequency (%)
® | 896 | 65.2%
(label missing) | 478 | 34.8%

Open Punctuation
Value | Count | Frequency (%)
( | 53 | 77.9%
[ | 15 | 22.1%

Close Punctuation
Value | Count | Frequency (%)
) | 53 | 77.9%
] | 15 | 22.1%

Space Separator
Value | Count | Frequency (%)
(space) | 34365 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2329 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
(label missing) | 69 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 40 | 100.0%

Initial Punctuation
Value | Count | Frequency (%)
(label missing) | 29 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 227275 | 78.8%
Common | 61245 | 21.2%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 25862 | 11.4%
r | 15875 | 7.0%
o | 15517 | 6.8%
a | 14627 | 6.4%
i | 14001 | 6.2%
l | 12489 | 5.5%
t | 12482 | 5.5%
n | 11792 | 5.2%
s | 11438 | 5.0%
c | 7441 | 3.3%
Other values (43) | 85751 | 37.7%

Common
Value | Count | Frequency (%)
(space) | 34365 | 56.1%
1 | 3209 | 5.2%
, | 2856 | 4.7%
0 | 2724 | 4.4%
- | 2329 | 3.8%
2 | 1984 | 3.2%
3 | 1528 | 2.5%
8 | 1410 | 2.3%
4 | 1406 | 2.3%
/ | 1359 | 2.2%
Other values (21) | 8075 | 13.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 287046 | 99.5%
None | 898 | 0.3%
Letterlike Symbols | 478 | 0.2%
Punctuation | 98 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 34365 | 12.0%
e | 25862 | 9.0%
r | 15875 | 5.5%
o | 15517 | 5.4%
a | 14627 | 5.1%
i | 14001 | 4.9%
l | 12489 | 4.4%
t | 12482 | 4.3%
n | 11792 | 4.1%
s | 11438 | 4.0%
Other values (69) | 118598 | 41.3%

None
Value | Count | Frequency (%)
® | 896 | 99.8%
à | 2 | 0.2%

Letterlike Symbols
Value | Count | Frequency (%)
(label missing) | 478 | 100.0%

Punctuation
Value | Count | Frequency (%)
(label missing) | 69 | 70.4%
(label missing) | 29 | 29.6%

product_sub_category
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Value | Count
Paper | 1225
Binders and Binder Accessories | 915
Telephones and Communication | 883
Office Furnishings | 788
Computer Peripherals | 758
Other values (12) | 3830

Length

Max length: 30
Median length: 20
Mean length: 17.080962
Min length: 5

Characters and Unicode

Total characters: 143463
Distinct characters: 37
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Storage & Organization
2nd row: Storage & Organization
3rd row: Chairs & Chairmats
4th row: Paper
5th row: Chairs & Chairmats

Common Values

Value | Count | Frequency (%)
Paper | 1225 | 14.6%
Binders and Binder Accessories | 915 | 10.9%
Telephones and Communication | 883 | 10.5%
Office Furnishings | 788 | 9.4%
Computer Peripherals | 758 | 9.0%
Pens & Art Supplies | 633 | 7.5%
Storage & Organization | 546 | 6.5%
Appliances | 434 | 5.2%
Chairs & Chairmats | 386 | 4.6%
Tables | 361 | 4.3%
Other values (7) | 1470 | 17.5%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
and | 2029 | 10.5%
& | 1565 | 8.1%
paper | 1225 | 6.3%
office | 1125 | 5.8%
binders | 915 | 4.7%
binder | 915 | 4.7%
accessories | 915 | 4.7%
telephones | 883 | 4.6%
communication | 883 | 4.6%
furnishings | 788 | 4.1%
Other values (22) | 8098 | 41.9%

Most occurring characters

Value | Count | Frequency (%)
e | 15400 | 10.7%
s | 11945 | 8.3%
i | 11613 | 8.1%
n | 11005 | 7.7%
(space) | 10942 | 7.6%
r | 10371 | 7.2%
a | 9566 | 6.7%
o | 6269 | 4.4%
p | 6091 | 4.2%
c | 4942 | 3.4%
Other values (27) | 45319 | 31.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 115065 | 80.2%
Uppercase Letter | 15747 | 11.0%
Space Separator | 10942 | 7.6%
Other Punctuation | 1709 | 1.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 15400 | 13.4%
s | 11945 | 10.4%
i | 11613 | 10.1%
n | 11005 | 9.6%
r | 10371 | 9.0%
a | 9566 | 8.3%
o | 6269 | 5.4%
p | 6091 | 5.3%
c | 4942 | 4.3%
d | 4038 | 3.5%
Other values (12) | 23825 | 20.7%

Uppercase Letter
Value | Count | Frequency (%)
P | 2616 | 16.6%
C | 2500 | 15.9%
B | 2198 | 14.0%
A | 1982 | 12.6%
O | 1671 | 10.6%
T | 1388 | 8.8%
S | 1323 | 8.4%
F | 875 | 5.6%
M | 337 | 2.1%
R | 323 | 2.1%
Other values (2) | 534 | 3.4%

Other Punctuation
Value | Count | Frequency (%)
& | 1565 | 91.6%
, | 144 | 8.4%

Space Separator
Value | Count | Frequency (%)
(space) | 10942 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 130812 | 91.2%
Common | 12651 | 8.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 15400 | 11.8%
s | 11945 | 9.1%
i | 11613 | 8.9%
n | 11005 | 8.4%
r | 10371 | 7.9%
a | 9566 | 7.3%
o | 6269 | 4.8%
p | 6091 | 4.7%
c | 4942 | 3.8%
d | 4038 | 3.1%
Other values (24) | 39572 | 30.3%

Common
Value | Count | Frequency (%)
(space) | 10942 | 86.5%
& | 1565 | 12.4%
, | 144 | 1.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 143463 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 15400 | 10.7%
s | 11945 | 8.3%
i | 11613 | 8.1%
n | 11005 | 7.7%
(space) | 10942 | 7.6%
r | 10371 | 7.2%
a | 9566 | 6.7%
o | 6269 | 4.4%
p | 6091 | 4.2%
c | 4942 | 3.4%
Other values (27) | 45319 | 31.6%

profit
Real number (ℝ)

Distinct: 7967
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 181.18442
Minimum: -14140.702
Maximum: 27220.69
Zeros: 0
Zeros (%): 0.0%
Negative: 4264
Negative (%): 50.8%
Memory size: 65.7 KiB

Quantile statistics

Minimum: -14140.702
5-th percentile: -592.43905
Q1: -83.315
median: -1.5
Q3: 162.748
95-th percentile: 1542.309
Maximum: 27220.69
Range: 41361.392
Interquartile range (IQR): 246.063

Descriptive statistics

Standard deviation: 1196.6533
Coefficient of variation (CV): 6.6046149
Kurtosis: 67.34971
Mean: 181.18442
Median Absolute Deviation (MAD): 104.3345
Skewness: 3.6472388
Sum: 1521768
Variance: 1431979.2
Monotonicity: Not monotonic
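
Several of the profit statistics above are derived from the others, so they can be cross-checked by hand. The sketch below is only an illustration: it uses the rounded figures printed in this report plus the 8399 observations from the overview, so the results agree with the report only up to rounding.

```python
# Values copied from the profit summary in this report (already rounded).
n = 8399                                  # number of observations
minimum, maximum = -14140.702, 27220.69
q1, q3 = -83.315, 162.748
mean, std = 181.18442, 1196.6533
negative = 4264

value_range = maximum - minimum           # Range
iqr = q3 - q1                             # Interquartile range (IQR)
cv = std / mean                           # Coefficient of variation (CV)
variance = std ** 2                       # Variance
total = mean * n                          # Sum
negative_pct = 100 * negative / n         # Negative (%)

print(f"range={value_range:.3f} iqr={iqr:.3f} cv={cv:.4f}")
print(f"variance={variance:.1f} sum={total:.0f} negative={negative_pct:.1f}%")
```

Each result lands within rounding distance of the reported Range, IQR, CV, Variance, Sum and Negative (%) values.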
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
-969.048366 | 8 | 0.1%
-433.290143 | 6 | 0.1%
-505.984479 | 5 | 0.1%
-715.778206 | 5 | 0.1%
-1331.553366 | 5 | 0.1%
-528.653125 | 5 | 0.1%
11.65095 | 5 | 0.1%
-66.87 | 4 | < 0.1%
-22.82 | 4 | < 0.1%
-513.79042 | 4 | < 0.1%
Other values (7957) | 8348 | 99.4%

Value | Count | Frequency (%)
-14140.7016 | 1 | < 0.1%
-12557.9976 | 1 | < 0.1%
-11984.3979 | 1 | < 0.1%
-11861.46 | 1 | < 0.1%
-11769.17 | 1 | < 0.1%
-11053.6 | 1 | < 0.1%
-10263.6597 | 1 | < 0.1%
-9611.91 | 1 | < 0.1%
-9078.94 | 1 | < 0.1%
-8570.4483 | 1 | < 0.1%

Value | Count | Frequency (%)
27220.69 | 1 | < 0.1%
14440.39 | 1 | < 0.1%
13340.26 | 1 | < 0.1%
12748.86 | 1 | < 0.1%
12606.81 | 1 | < 0.1%
11984.395 | 1 | < 0.1%
11630.146 | 1 | < 0.1%
11562.08 | 1 | < 0.1%
11535.282 | 1 | < 0.1%
10951.3065 | 1 | < 0.1%

region
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Central: 2710
West: 1956
East: 1895
South: 1838

Length

Max length: 7
Median length: 5
Mean length: 5.186808
Min length: 4
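
The length statistics for region follow directly from the four value counts. A minimal sketch, assuming the counts listed for this variable (Central 2710, West 1956, East 1895, South 1838) and using only the Python standard library:

```python
import statistics

# Value counts for the region variable, as listed in this report.
counts = {"Central": 2710, "West": 1956, "East": 1895, "South": 1838}

n = sum(counts.values())
# Expand to one string length per observation.
lengths = [len(value) for value, count in counts.items() for _ in range(count)]

total_chars = sum(lengths)
mean_length = total_chars / n
median_length = statistics.median(lengths)

print(n, total_chars, max(lengths), min(lengths))
print(round(mean_length, 6), median_length)
```

The total of 43564 characters is the same figure reported under "Characters and Unicode" for this variable.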

Characters and Unicode

Total characters: 43564
Distinct characters: 14
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
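
As an illustration of those character properties, the sketch below tallies characters and Unicode general categories for region from its value counts, using Python's standard `unicodedata` module. It covers general categories only (`Lu` is Uppercase Letter, `Ll` is Lowercase Letter); the standard library has no lookup for scripts or blocks, so those tables are not reproduced here.

```python
import unicodedata
from collections import Counter

# Value counts for the region variable, as listed in this report.
counts = {"Central": 2710, "West": 1956, "East": 1895, "South": 1838}

char_counts = Counter()
category_counts = Counter()
for value, count in counts.items():
    for ch in value:
        char_counts[ch] += count
        # unicodedata.category returns the two-letter general category code.
        category_counts[unicodedata.category(ch)] += count

print(category_counts)
print(char_counts.most_common(4))
```

Every region name contains exactly one lowercase "t", which is why that character's count equals the number of observations.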

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: East
2nd row: East
3rd row: West
4th row: West
5th row: West

Common Values

Value | Count | Frequency (%)
Central | 2710 | 32.3%
West | 1956 | 23.3%
East | 1895 | 22.6%
South | 1838 | 21.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
central | 2710 | 32.3%
west | 1956 | 23.3%
east | 1895 | 22.6%
south | 1838 | 21.9%

Most occurring characters

Value | Count | Frequency (%)
t | 8399 | 19.3%
e | 4666 | 10.7%
a | 4605 | 10.6%
s | 3851 | 8.8%
C | 2710 | 6.2%
n | 2710 | 6.2%
r | 2710 | 6.2%
l | 2710 | 6.2%
W | 1956 | 4.5%
E | 1895 | 4.3%
Other values (4) | 7352 | 16.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 35165 | 80.7%
Uppercase Letter | 8399 | 19.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 8399 | 23.9%
e | 4666 | 13.3%
a | 4605 | 13.1%
s | 3851 | 11.0%
n | 2710 | 7.7%
r | 2710 | 7.7%
l | 2710 | 7.7%
o | 1838 | 5.2%
u | 1838 | 5.2%
h | 1838 | 5.2%

Uppercase Letter
Value | Count | Frequency (%)
C | 2710 | 32.3%
W | 1956 | 23.3%
E | 1895 | 22.6%
S | 1838 | 21.9%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 43564 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
t | 8399 | 19.3%
e | 4666 | 10.7%
a | 4605 | 10.6%
s | 3851 | 8.8%
C | 2710 | 6.2%
n | 2710 | 6.2%
r | 2710 | 6.2%
l | 2710 | 6.2%
W | 1956 | 4.5%
E | 1895 | 4.3%
Other values (4) | 7352 | 16.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 43564 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
t | 8399 | 19.3%
e | 4666 | 10.7%
a | 4605 | 10.6%
s | 3851 | 8.8%
C | 2710 | 6.2%
n | 2710 | 6.2%
r | 2710 | 6.2%
l | 2710 | 6.2%
W | 1956 | 4.5%
E | 1895 | 4.3%
Other values (4) | 7352 | 16.9%

row_id
Real number (ℝ)

HIGH CORRELATION  UNIFORM  UNIQUE 

Distinct: 8399
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4200
Minimum: 1
Maximum: 8399
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 420.9
Q1: 2100.5
median: 4200
Q3: 6299.5
95-th percentile: 7979.1
Maximum: 8399
Range: 8398
Interquartile range (IQR): 4199

Descriptive statistics

Standard deviation: 2424.7268
Coefficient of variation (CV): 0.5773159
Kurtosis: -1.2
Mean: 4200
Median Absolute Deviation (MAD): 2100
Skewness: 0
Sum: 35275800
Variance: 5879300
Monotonicity: Not monotonic
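
The row_id figures are what one would expect if the column holds each integer from 1 to 8399 exactly once, which is consistent with the UNIQUE and UNIFORM alerts. The sketch below checks this with closed-form expressions. It rests on two assumptions that the report does not state: the variance uses the sample (n - 1) denominator, and percentiles are linearly interpolated.

```python
import math

n = 8399  # assumed: row_id takes each integer 1..n exactly once

mean = (n + 1) / 2
total = n * (n + 1) // 2
variance = n * (n + 1) / 12          # sample variance (n - 1 denominator) of 1..n
std = math.sqrt(variance)
cv = std / mean

def percentile(p):
    """Linearly interpolated percentile of the sorted values 1..n."""
    return 1 + p * (n - 1)

print(mean, total, variance, round(std, 4), round(cv, 7))
print([round(percentile(p), 1) for p in (0.05, 0.25, 0.5, 0.75, 0.95)])
```

Under those assumptions the mean, sum, variance, standard deviation and all five quantiles match the values reported for row_id.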
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
4031 | 1 | < 0.1%
671 | 1 | < 0.1%
669 | 1 | < 0.1%
672 | 1 | < 0.1%
4860 | 1 | < 0.1%
883 | 1 | < 0.1%
5148 | 1 | < 0.1%
1554 | 1 | < 0.1%
5426 | 1 | < 0.1%
6973 | 1 | < 0.1%
Other values (8389) | 8389 | 99.9%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
8399 | 1 | < 0.1%
8398 | 1 | < 0.1%
8397 | 1 | < 0.1%
8396 | 1 | < 0.1%
8395 | 1 | < 0.1%
8394 | 1 | < 0.1%
8393 | 1 | < 0.1%
8392 | 1 | < 0.1%
8391 | 1 | < 0.1%
8390 | 1 | < 0.1%

sales
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8153
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1775.8782
Minimum: 2.24
Maximum: 89061.05
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 2.24
5-th percentile: 34.178
Q1: 143.195
median: 449.42
Q3: 1709.32
95-th percentile: 7844.335
Maximum: 89061.05
Range: 89058.81
Interquartile range (IQR): 1566.125

Descriptive statistics

Standard deviation: 3585.0505
Coefficient of variation (CV): 2.018748
Kurtosis: 60.928376
Mean: 1775.8782
Median Absolute Deviation (MAD): 381.95
Skewness: 5.3869824
Sum: 14915601
Variance: 12852587
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
75.19 | 3 | < 0.1%
19.36 | 3 | < 0.1%
20.19 | 3 | < 0.1%
74.02 | 3 | < 0.1%
10.48 | 3 | < 0.1%
224.58 | 3 | < 0.1%
43.29 | 3 | < 0.1%
151.19 | 3 | < 0.1%
127.56 | 3 | < 0.1%
115.81 | 3 | < 0.1%
Other values (8143) | 8369 | 99.6%

Value | Count | Frequency (%)
2.24 | 1 | < 0.1%
3.2 | 1 | < 0.1%
3.23 | 1 | < 0.1%
3.41 | 1 | < 0.1%
3.42 | 1 | < 0.1%
3.63 | 1 | < 0.1%
3.77 | 1 | < 0.1%
3.85 | 1 | < 0.1%
3.96 | 1 | < 0.1%
4.94 | 1 | < 0.1%

Value | Count | Frequency (%)
89061.05 | 1 | < 0.1%
45923.76 | 1 | < 0.1%
41343.21 | 1 | < 0.1%
33367.85 | 1 | < 0.1%
29884.6 | 1 | < 0.1%
29345.27 | 1 | < 0.1%
29186.49 | 1 | < 0.1%
28761.52 | 1 | < 0.1%
28664.52 | 1 | < 0.1%
28389.14 | 1 | < 0.1%

ship_date

Distinct: 1450
Distinct (%): 17.3%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB
Minimum: 2012-01-02 00:00:00
Maximum: 2015-12-30 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]

ship_mode
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

Regular Air: 6270
Delivery Truck: 1146
Express Air: 983

Length

Max length: 14
Median length: 11
Mean length: 11.409334
Min length: 11

Characters and Unicode

Total characters: 95827
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Regular Air
2nd row: Express Air
3rd row: Delivery Truck
4th row: Regular Air
5th row: Delivery Truck

Common Values

Value | Count | Frequency (%)
Regular Air | 6270 | 74.7%
Delivery Truck | 1146 | 13.6%
Express Air | 983 | 11.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
air | 7253 | 43.2%
regular | 6270 | 37.3%
delivery | 1146 | 6.8%
truck | 1146 | 6.8%
express | 983 | 5.9%
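
The word table above can be reproduced from the ship_mode value counts. The sketch assumes that words are obtained by lowercasing each value and splitting on whitespace. That assumption matches the lowercase tokens shown, but the report itself does not describe the tokenization.

```python
from collections import Counter

# Value counts for the ship_mode variable, as listed in this report.
counts = {"Regular Air": 6270, "Delivery Truck": 1146, "Express Air": 983}

word_counts = Counter()
for value, count in counts.items():
    # Assumed tokenization: lowercase, then split on whitespace.
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
shares = {word: round(100 * c / total_words, 1) for word, c in word_counts.items()}

for word, c in word_counts.most_common():
    print(word, c, f"{shares[word]}%")
```

"air" is counted for both Regular Air and Express Air, which gives 6270 + 983 = 7253 out of 16798 words in total.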

Most occurring characters

Value | Count | Frequency (%)
r | 16798 | 17.5%
e | 9545 | 10.0%
(space) | 8399 | 8.8%
i | 8399 | 8.8%
u | 7416 | 7.7%
l | 7416 | 7.7%
A | 7253 | 7.6%
R | 6270 | 6.5%
g | 6270 | 6.5%
a | 6270 | 6.5%
Other values (10) | 11791 | 12.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 70630 | 73.7%
Uppercase Letter | 16798 | 17.5%
Space Separator | 8399 | 8.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 16798 | 23.8%
e | 9545 | 13.5%
i | 8399 | 11.9%
u | 7416 | 10.5%
l | 7416 | 10.5%
g | 6270 | 8.9%
a | 6270 | 8.9%
s | 1966 | 2.8%
v | 1146 | 1.6%
y | 1146 | 1.6%
Other values (4) | 4258 | 6.0%

Uppercase Letter
Value | Count | Frequency (%)
A | 7253 | 43.2%
R | 6270 | 37.3%
T | 1146 | 6.8%
D | 1146 | 6.8%
E | 983 | 5.9%

Space Separator
Value | Count | Frequency (%)
(space) | 8399 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 87428 | 91.2%
Common | 8399 | 8.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 16798 | 19.2%
e | 9545 | 10.9%
i | 8399 | 9.6%
u | 7416 | 8.5%
l | 7416 | 8.5%
A | 7253 | 8.3%
R | 6270 | 7.2%
g | 6270 | 7.2%
a | 6270 | 7.2%
s | 1966 | 2.2%
Other values (9) | 9825 | 11.2%

Common
Value | Count | Frequency (%)
(space) | 8399 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 95827 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 16798 | 17.5%
e | 9545 | 10.0%
(space) | 8399 | 8.8%
i | 8399 | 8.8%
u | 7416 | 7.7%
l | 7416 | 7.7%
A | 7253 | 7.6%
R | 6270 | 6.5%
g | 6270 | 6.5%
a | 6270 | 6.5%
Other values (10) | 11791 | 12.3%

shipping_cost
Real number (ℝ)

HIGH CORRELATION 

Distinct: 652
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.838557
Minimum: 0.49
Maximum: 164.73
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 0.49
5-th percentile: 0.8
Q1: 3.3
median: 6.07
Q3: 13.99
95-th percentile: 55.351
Maximum: 164.73
Range: 164.24
Interquartile range (IQR): 10.69

Descriptive statistics

Standard deviation: 17.264052
Coefficient of variation (CV): 1.3447035
Kurtosis: 7.7515872
Mean: 12.838557
Median Absolute Deviation (MAD): 3.61
Skewness: 2.5538008
Sum: 107831.04
Variance: 298.04749
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
19.99 | 352 | 4.2%
8.99 | 321 | 3.8%
1.99 | 247 | 2.9%
0.5 | 190 | 2.3%
0.99 | 144 | 1.7%
4 | 143 | 1.7%
1.49 | 138 | 1.6%
0.7 | 138 | 1.6%
24.49 | 132 | 1.6%
2.99 | 124 | 1.5%
Other values (642) | 6470 | 77.0%

Value | Count | Frequency (%)
0.49 | 34 | 0.4%
0.5 | 190 | 2.3%
0.7 | 138 | 1.6%
0.71 | 22 | 0.3%
0.73 | 1 | < 0.1%
0.75 | 7 | 0.1%
0.76 | 7 | 0.1%
0.78 | 7 | 0.1%
0.79 | 3 | < 0.1%
0.8 | 24 | 0.3%

Value | Count | Frequency (%)
164.73 | 1 | < 0.1%
154.12 | 1 | < 0.1%
147.12 | 2 | < 0.1%
143.71 | 1 | < 0.1%
130 | 1 | < 0.1%
126 | 1 | < 0.1%
110.2 | 10 | 0.1%
99 | 7 | 0.1%
91.05 | 5 | 0.1%
89.3 | 13 | 0.2%

state
Categorical

HIGH CORRELATION 

Distinct: 48
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 65.7 KiB

California: 780
Texas: 577
Illinois: 500
Florida: 479
Ohio: 396
Other values (43): 5667

Length

Max length: 14
Median length: 12
Mean length: 7.7852125
Min length: 2

Characters and Unicode

Total characters: 65388
Distinct characters: 46
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Pennsylvania
2nd row: Maryland
3rd row: California
4th row: California
5th row: California

Common Values

Value | Count | Frequency (%)
California | 780 | 9.3%
Texas | 577 | 6.9%
Illinois | 500 | 6.0%
Florida | 479 | 5.7%
Ohio | 396 | 4.7%
New York | 372 | 4.4%
Michigan | 291 | 3.5%
Indiana | 241 | 2.9%
Washington | 240 | 2.9%
Minnesota | 239 | 2.8%
Other values (38) | 4284 | 51.0%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
california | 780 | 8.2%
new | 687 | 7.2%
texas | 577 | 6.1%
illinois | 500 | 5.2%
florida | 479 | 5.0%
ohio | 396 | 4.2%
york | 372 | 3.9%
carolina | 316 | 3.3%
michigan | 291 | 3.1%
north | 245 | 2.6%
Other values (40) | 4884 | 51.3%

Most occurring characters

Value | Count | Frequency (%)
a | 8410 | 12.9%
i | 7507 | 11.5%
n | 6295 | 9.6%
o | 5725 | 8.8%
e | 3794 | 5.8%
r | 3778 | 5.8%
s | 3729 | 5.7%
l | 3413 | 5.2%
h | 1743 | 2.7%
t | 1465 | 2.2%
Other values (36) | 19529 | 29.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 54422 | 83.2%
Uppercase Letter | 9838 | 15.0%
Space Separator | 1128 | 1.7%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 8410 | 15.5%
i | 7507 | 13.8%
n | 6295 | 11.6%
o | 5725 | 10.5%
e | 3794 | 7.0%
r | 3778 | 6.9%
s | 3729 | 6.9%
l | 3413 | 6.3%
h | 1743 | 3.2%
t | 1465 | 2.7%
Other values (14) | 8563 | 15.7%

Uppercase Letter
Value | Count | Frequency (%)
M | 1358 | 13.8%
C | 1355 | 13.8%
N | 1052 | 10.7%
I | 1031 | 10.5%
O | 829 | 8.4%
T | 743 | 7.6%
A | 532 | 5.4%
F | 479 | 4.9%
W | 473 | 4.8%
Y | 372 | 3.8%
Other values (11) | 1614 | 16.4%

Space Separator
Value | Count | Frequency (%)
(space) | 1128 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 64260 | 98.3%
Common | 1128 | 1.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 8410 | 13.1%
i | 7507 | 11.7%
n | 6295 | 9.8%
o | 5725 | 8.9%
e | 3794 | 5.9%
r | 3778 | 5.9%
s | 3729 | 5.8%
l | 3413 | 5.3%
h | 1743 | 2.7%
t | 1465 | 2.3%
Other values (35) | 18401 | 28.6%

Common
Value | Count | Frequency (%)
(space) | 1128 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 65388 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 8410 | 12.9%
i | 7507 | 11.5%
n | 6295 | 9.6%
o | 5725 | 8.8%
e | 3794 | 5.8%
r | 3778 | 5.8%
s | 3729 | 5.7%
l | 3413 | 5.2%
h | 1743 | 2.7%
t | 1465 | 2.2%
Other values (36) | 19529 | 29.9%

unit_price
Real number (ℝ)

HIGH CORRELATION 

Distinct: 751
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 89.346259
Minimum: 0.99
Maximum: 6783.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 0.99
5-th percentile: 2.88
Q1: 6.48
median: 20.99
Q3: 85.99
95-th percentile: 320.64
Maximum: 6783.02
Range: 6782.03
Interquartile range (IQR): 79.51

Descriptive statistics

Standard deviation: 290.35438
Coefficient of variation (CV): 3.2497654
Kurtosis: 271.16873
Mean: 89.346259
Median Absolute Deviation (MAD): 17.01
Skewness: 14.127793
Sum: 750419.23
Variance: 84305.668
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
6.48 | 264 | 3.1%
65.99 | 192 | 2.3%
4.98 | 136 | 1.6%
125.99 | 115 | 1.4%
5.98 | 102 | 1.2%
2.88 | 81 | 1.0%
20.99 | 73 | 0.9%
30.98 | 73 | 0.9%
35.99 | 70 | 0.8%
205.99 | 66 | 0.8%
Other values (741) | 7227 | 86.0%

Value | Count | Frequency (%)
0.99 | 2 | < 0.1%
1.14 | 10 | 0.1%
1.26 | 13 | 0.2%
1.48 | 12 | 0.1%
1.6 | 5 | 0.1%
1.68 | 22 | 0.3%
1.7 | 8 | 0.1%
1.74 | 9 | 0.1%
1.76 | 29 | 0.3%
1.8 | 3 | < 0.1%

Value | Count | Frequency (%)
6783.02 | 7 | 0.1%
3502.14 | 6 | 0.1%
3499.99 | 7 | 0.1%
2550.14 | 7 | 0.1%
2036.48 | 6 | 0.1%
1938.02 | 8 | 0.1%
1889.99 | 3 | < 0.1%
1637.53 | 2 | < 0.1%
1500.97 | 5 | 0.1%
1360.14 | 3 | < 0.1%

zip_code
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1626
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52839.139
Minimum: 1001
Maximum: 99362
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 65.7 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 6032.7
Q1: 30337
median: 52732
Q3: 77577
95-th percentile: 95992.2
Maximum: 99362
Range: 98361
Interquartile range (IQR): 47240

Descriptive statistics

Standard deviation: 28509.536
Coefficient of variation (CV): 0.53955337
Kurtosis: -1.1269582
Mean: 52839.139
Median Absolute Deviation (MAD): 23321
Skewness: -0.055889262
Sum: 4.4379593 × 10^8
Variance: 8.1279362 × 10^8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
94110 | 22 | 0.3%
92277 | 22 | 0.3%
88201 | 21 | 0.3%
81301 | 20 | 0.2%
55372 | 19 | 0.2%
59715 | 19 | 0.2%
87105 | 18 | 0.2%
46203 | 17 | 0.2%
4401 | 17 | 0.2%
21222 | 15 | 0.2%
Other values (1616) | 8209 | 97.7%

Value | Count | Frequency (%)
1001 | 1 | < 0.1%
1007 | 1 | < 0.1%
1013 | 1 | < 0.1%
1027 | 1 | < 0.1%
1028 | 1 | < 0.1%
1040 | 1 | < 0.1%
1056 | 1 | < 0.1%
1060 | 1 | < 0.1%
1069 | 1 | < 0.1%
1075 | 2 | < 0.1%

Value | Count | Frequency (%)
99362 | 8 | 0.1%
99352 | 5 | 0.1%
99336 | 6 | 0.1%
99301 | 6 | 0.1%
99207 | 7 | 0.1%
99163 | 7 | 0.1%
98902 | 3 | < 0.1%
98801 | 7 | 0.1%
98661 | 8 | 0.1%
98632 | 6 | 0.1%

Interactions

[Figures: interaction plots, 121 panels]

Correlations

Variable | customer_age | discount | order_id | order_quantity | product_base_margin | profit | row_id | sales | shipping_cost | unit_price | zip_code | customer_segment | order_priority | product_category | product_container | product_sub_category | region | ship_mode | state
customer_age | 1.000 | 0.017 | 0.016 | 0.019 | -0.018 | 0.012 | 0.016 | 0.009 | -0.002 | 0.000 | 0.003 | 0.010 | 0.037 | 0.000 | 0.000 | 0.000 | 0.013 | 0.018 | 0.062
discount | 0.017 | 1.000 | -0.002 | -0.009 | -0.001 | -0.073 | -0.002 | -0.022 | -0.003 | -0.001 | -0.008 | 0.005 | 0.015 | 0.027 | 0.023 | 0.000 | 0.000 | 0.000 | 0.000
order_id | 0.016 | -0.002 | 1.000 | 0.011 | -0.020 | 0.009 | 1.000 | -0.011 | -0.004 | -0.018 | -0.005 | 0.027 | 0.041 | 0.000 | 0.001 | 0.000 | 0.041 | 0.000 | 0.065
order_quantity | 0.019 | -0.009 | 0.011 | 1.000 | 0.011 | 0.238 | 0.011 | 0.399 | -0.024 | -0.030 | 0.005 | 0.000 | 0.014 | 0.000 | 0.000 | 0.017 | 0.012 | 0.000 | 0.018
product_base_margin | -0.018 | -0.001 | -0.020 | 0.011 | 1.000 | -0.207 | -0.020 | 0.349 | 0.291 | 0.392 | -0.006 | 0.016 | 0.014 | 0.442 | 0.291 | 0.422 | 0.016 | 0.336 | 0.017
profit | 0.012 | -0.073 | 0.009 | 0.238 | -0.207 | 1.000 | 0.009 | 0.325 | -0.193 | 0.227 | 0.013 | 0.000 | 0.021 | 0.103 | 0.131 | 0.168 | 0.021 | 0.149 | 0.017
row_id | 0.016 | -0.002 | 1.000 | 0.011 | -0.020 | 0.009 | 1.000 | -0.011 | -0.004 | -0.018 | -0.005 | 0.030 | 0.039 | 0.000 | 0.000 | 0.000 | 0.041 | 0.000 | 0.063
sales | 0.009 | -0.022 | -0.011 | 0.399 | 0.349 | 0.325 | -0.011 | 1.000 | 0.587 | 0.877 | -0.000 | 0.000 | 0.000 | 0.112 | 0.149 | 0.197 | 0.000 | 0.214 | 0.000
shipping_cost | -0.002 | -0.003 | -0.004 | -0.024 | 0.291 | -0.193 | -0.004 | 0.587 | 1.000 | 0.652 | -0.008 | 0.000 | 0.020 | 0.363 | 0.374 | 0.306 | 0.019 | 0.518 | 0.000
unit_price | 0.000 | -0.001 | -0.018 | -0.030 | 0.392 | 0.227 | -0.018 | 0.877 | 0.652 | 1.000 | -0.007 | 0.011 | 0.012 | 0.094 | 0.121 | 0.197 | 0.021 | 0.088 | 0.000
zip_code | 0.003 | -0.008 | -0.005 | 0.005 | -0.006 | 0.013 | -0.005 | -0.000 | -0.008 | -0.007 | 1.000 | 0.117 | 0.036 | 0.000 | 0.014 | 0.000 | 0.885 | 0.000 | 0.970
customer_segment | 0.010 | 0.005 | 0.027 | 0.000 | 0.016 | 0.000 | 0.030 | 0.000 | 0.000 | 0.011 | 0.117 | 1.000 | 0.000 | 0.006 | 0.000 | 0.000 | 0.067 | 0.000 | 0.201
order_priority | 0.037 | 0.015 | 0.041 | 0.014 | 0.014 | 0.021 | 0.039 | 0.000 | 0.020 | 0.012 | 0.036 | 0.000 | 1.000 | 0.000 | 0.005 | 0.000 | 0.003 | 0.003 | 0.064
product_category | 0.000 | 0.027 | 0.000 | 0.000 | 0.442 | 0.103 | 0.000 | 0.112 | 0.363 | 0.094 | 0.000 | 0.006 | 0.000 | 1.000 | 0.491 | 0.999 | 0.019 | 0.382 | 0.019
product_container | 0.000 | 0.023 | 0.001 | 0.000 | 0.291 | 0.131 | 0.000 | 0.149 | 0.374 | 0.121 | 0.014 | 0.000 | 0.005 | 0.491 | 1.000 | 0.654 | 0.032 | 0.703 | 0.026
product_sub_category | 0.000 | 0.000 | 0.000 | 0.017 | 0.422 | 0.168 | 0.000 | 0.197 | 0.306 | 0.197 | 0.000 | 0.000 | 0.000 | 0.999 | 0.654 | 1.000 | 0.041 | 0.611 | 0.005
region | 0.013 | 0.000 | 0.041 | 0.012 | 0.016 | 0.021 | 0.041 | 0.000 | 0.019 | 0.021 | 0.885 | 0.067 | 0.003 | 0.019 | 0.032 | 0.041 | 1.000 | 0.011 | 0.997
ship_mode | 0.018 | 0.000 | 0.000 | 0.000 | 0.336 | 0.149 | 0.000 | 0.214 | 0.518 | 0.088 | 0.000 | 0.000 | 0.003 | 0.382 | 0.703 | 0.611 | 0.011 | 1.000 | 0.033
state | 0.062 | 0.000 | 0.065 | 0.018 | 0.017 | 0.017 | 0.063 | 0.000 | 0.000 | 0.000 | 0.970 | 0.201 | 0.064 | 0.019 | 0.026 | 0.005 | 0.997 | 0.033 | 1.000
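
The report does not say which coefficients fill this matrix. For pairs of categorical variables such as region and state, one common association measure is Cramér's V, which is 0 for independent variables and 1 when one variable fully determines the other. The sketch below is an assumption-labelled illustration of that measure on small hypothetical data. It is not taken from the report, and a profiling tool may apply a bias correction or a different coefficient altogether.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical data: each state maps to exactly one region.
states = ["CA", "CA", "TX", "TX", "NY", "NY"]
regions = ["West", "West", "Central", "Central", "East", "East"]
print(cramers_v(states, regions))

# Hypothetical data with no association at all.
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))
```

A state that always falls in the same region gives a value at or near 1, which is the pattern behind the 0.997 entry for region and state.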

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
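
Nullity correlation can be read as the ordinary Pearson correlation between the missing-value indicators of two columns. The sketch below illustrates that reading on hypothetical columns, with `None` standing in for a missing cell. The function name and the data are made up for the example and are not part of the report.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical columns: missing in the same rows, then in opposite rows.
age = [None, 34, None, 51]
margin_same = [None, 0.5, None, 0.6]
margin_opposite = [0.4, None, 0.7, None]

print(nullity_correlation(age, margin_same))      # 1.0
print(nullity_correlation(age, margin_opposite))  # -1.0
```

A value of 1 means two columns are missing in the same rows, and -1 means one is missing exactly where the other is present.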

Sample

index | city | customer_age | customer_name | customer_segment | discount | number_of_records | order_date | order_id | order_priority | order_quantity | product_base_margin | product_category | product_container | product_name | product_sub_category | profit | region | row_id | sales | ship_date | ship_mode | shipping_cost | state | unit_price | zip_code
0 | McKeesport | NaN | Jessica Myrick | Small Business | 0.10 | 1 | 2012-01-01 | 28774 | High | 32 | 0.68 | Office Supplies | Small Box | Perma STOR-ALL™ Hanging File Box, 13 1/8"W x 12 1/4"D x 10 1/2"H | Storage & Organization | -111.800 | East | 4031 | 180.36 | 2012-01-02 | Regular Air | 4.69 | Pennsylvania | 5.98 | 15131
1 | Bowie | NaN | Matt Collister | Home Office | 0.08 | 1 | 2012-01-01 | 13729 | Not Specified | 9 | NaN | Office Supplies | Large Box | Safco Industrial Wire Shelving | Storage & Organization | -342.910 | East | 1914 | 872.48 | 2012-01-03 | Express Air | 35.00 | Maryland | 95.99 | 20715
2 | Napa | NaN | Alan Schoenberger | Corporate | 0.00 | 1 | 2012-01-02 | 37537 | Low | 4 | 0.56 | Furniture | Jumbo Drum | Hon 4070 Series Pagoda™ Armless Upholstered Stacking Chairs | Chairs & Chairmats | -193.080 | West | 5272 | 1239.06 | 2012-01-02 | Delivery Truck | 48.80 | California | 291.73 | 94559
3 | Montebello | NaN | Elizabeth Moffitt | Consumer | 0.08 | 1 | 2012-01-02 | 44069 | Critical | 43 | 0.39 | Office Supplies | Wrap Bag | White GlueTop Scratch Pads | Paper | 247.790 | West | 6225 | 614.80 | 2012-01-02 | Regular Air | 1.97 | California | 15.04 | 90640
4 | Napa | NaN | Alan Schoenberger | Corporate | 0.07 | 1 | 2012-01-02 | 37537 | Low | 43 | 0.69 | Furniture | Jumbo Drum | Hon Valutask™ Swivel Chairs | Chairs & Chairmats | -1049.850 | West | 5273 | 4083.19 | 2012-01-04 | Delivery Truck | 45.00 | California | 100.98 | 94559
5 | Montebello | NaN | Elizabeth Moffitt | Consumer | 0.09 | 1 | 2012-01-02 | 44069 | Critical | 16 | 0.40 | Office Supplies | Wrap Bag | Black Print Carbonless Snap-Off® Rapid Letter, 8 1/2" x 7" | Paper | 26.710 | West | 6224 | 137.63 | 2012-01-04 | Express Air | 2.15 | California | 9.11 | 90640
6 | Prior Lake | NaN | David Philippe | Consumer | 0.06 | 1 | 2012-01-02 | 9285 | Critical | 3 | 0.36 | Office Supplies | Small Box | Avery Trapezoid Ring Binder, 3" Capacity, Black, 1040 sheets | Binders and Binder Accessories | -11.937 | Central | 1279 | 124.81 | 2012-01-04 | Regular Air | 2.99 | Minnesota | 40.98 | 55372
7 | Napa | NaN | Alan Schoenberger | Corporate | 0.05 | 1 | 2012-01-02 | 37537 | Low | 32 | 0.59 | Office Supplies | Small Box | Dual Level, Single-Width Filing Carts | Storage & Organization | 1438.490 | West | 5274 | 4902.38 | 2012-01-09 | Regular Air | 7.07 | California | 155.06 | 94559
8 | Phenix City | NaN | Patrick Jones | Home Office | 0.09 | 1 | 2012-01-03 | 40354 | High | 4 | 0.64 | Furniture | Jumbo Box | Bush Advantage Collection® Round Conference Table | Tables | -93.160 | South | 5705 | 698.00 | 2012-01-04 | Delivery Truck | 52.20 | Alabama | 212.60 | 36869
9 | Draper | NaN | Larry Tron | Home Office | 0.05 | 1 | 2012-01-03 | 9762 | High | 12 | 0.78 | Furniture | Medium Box | 36X48 HARDFLOOR CHAIRMAT | Office Furnishings | -146.050 | West | 1336 | 262.76 | 2012-01-04 | Regular Air | 21.20 | Utah | 20.98 | 84020
index | city | customer_age | customer_name | customer_segment | discount | number_of_records | order_date | order_id | order_priority | order_quantity | product_base_margin | product_category | product_container | product_name | product_sub_category | profit | region | row_id | sales | ship_date | ship_mode | shipping_cost | state | unit_price | zip_code
8389 | Scarsdale | 95.0 | Shirley Schmidt | Corporate | 0.05 | 1 | 2015-12-29 | 53730 | High | 40 | 0.36 | Office Supplies | Small Box | Avery® Durable Plastic 1" Binders | Binders and Binder Accessories | -144.739 | East | 7524 | 181.80 | 2015-12-30 | Regular Air | 5.83 | New York | 4.54 | 10583
8390 | Olathe | 88.0 | Anna Andreadi | Small Business | 0.09 | 1 | 2015-12-29 | 13507 | Medium | 27 | 0.39 | Office Supplies | Small Box | Strathmore Photo Mount Cards | Paper | -75.710 | Central | 1876 | 176.10 | 2015-12-30 | Regular Air | 6.18 | Kansas | 6.78 | 66062
8391 | Horn Lake | 88.0 | Jennifer Jackson | Home Office | 0.10 | 1 | 2015-12-29 | 29216 | Critical | 46 | 0.64 | Technology | Small Box | Fellowes Mobile Numeric Keypad, Graphite | Computer Peripherals | 307.170 | South | 4100 | 1936.45 | 2015-12-30 | Regular Air | 4.00 | Mississippi | 43.22 | 38637
8392 | Fairfield | 95.0 | Tony Molinari | Corporate | 0.06 | 1 | 2015-12-30 | 50950 | Not Specified | 6 | 0.70 | Furniture | Jumbo Drum | Novimex Fabric Task Chair | Chairs & Chairmats | -166.960 | West | 7141 | 391.12 | 2015-12-30 | Delivery Truck | 30.00 | California | 60.98 | 94533
8393 | Charlottesville | 95.0 | Jim Epp | Small Business | 0.08 | 1 | 2015-12-30 | 47815 | Not Specified | 45 | 0.54 | Furniture | Wrap Bag | DAX Wood Document Frame. | Office Furnishings | -33.470 | South | 6712 | 580.96 | 2015-12-30 | Regular Air | 6.85 | Virginia | 13.73 | 22901
8394 | Fairfield | 95.0 | Tony Molinari | Corporate | 0.10 | 1 | 2015-12-30 | 50950 | Not Specified | 35 | 0.59 | Office Supplies | Small Box | Tenex Personal Project File with Scoop Front Design, Black | Storage & Organization | -15.070 | West | 7142 | 448.10 | 2015-12-30 | Express Air | 4.51 | California | 13.48 | 94533
8395 | Harker Heights | 95.0 | Matt Hagelstein | Home Office | 0.09 | 1 | 2015-12-30 | 25542 | Low | 37 | 0.39 | Office Supplies | Wrap Bag | Black Print Carbonless 8 1/2" x 8 1/4" Rapid Memo Book | Paper | -18.660 | Central | 3583 | 257.46 | 2015-12-30 | Express Air | 4.23 | Texas | 7.28 | 76543
8396 | Riverview | 95.0 | Theresa Swint | Consumer | 0.10 | 1 | 2015-12-30 | 45127 | Medium | 10 | 0.37 | Office Supplies | Wrap Bag | Binder Clips by OIC | Rubber Bands | -1.290 | South | 6361 | 14.15 | 2015-12-30 | Regular Air | 0.70 | Florida | 1.48 | 33569
8397 | Nicholasville | 95.0 | Maribeth Yedwab | Home Office | 0.09 | 1 | 2015-12-30 | 49344 | Low | 1 | 0.83 | Office Supplies | Medium Box | Martin Yale Chadless Opener Electric Letter Opener | Scissors, Rulers and Trimmers | -745.200 | South | 6916 | 803.33 | 2015-12-30 | Regular Air | 24.49 | Kentucky | 832.81 | 40356
8398 | Nicholasville | 95.0 | Maribeth Yedwab | Home Office | 0.00 | 1 | 2015-12-30 | 49344 | Low | 31 | 0.68 | Technology | Small Box | Belkin 105-Key Black Keyboard | Computer Peripherals | 27.850 | South | 6915 | 672.93 | 2015-12-30 | Regular Air | 4.00 | Kentucky | 19.98 | 40356